13 research outputs found

    API design for machine learning software: experiences from the scikit-learn project

    Get PDF
    Scikit-learn is an increasingly popular machine learning library. Written in Python, it is designed to be simple and efficient, accessible to non-experts, and reusable in various contexts. In this paper, we present and discuss our design choices for the application programming interface (API) of the project. In particular, we describe the simple and elegant interface shared by all learning and processing units in the library and then discuss its advantages in terms of composition and reusability. The paper also comments on implementation details specific to the Python ecosystem and analyzes obstacles faced by users and developers of the library.
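    As a rough illustration of the shared interface and its use for composition, the toy sketch below mirrors the fit/transform/predict convention in plain Python. It does not import scikit-learn; the class names MeanCenterer, NearestMeanClassifier and SimplePipeline and the sample data are hypothetical, chosen only for this example.

```python
class MeanCenterer:
    """Toy transformer: fit() learns per-feature means, transform() subtracts them."""

    def fit(self, X, y=None):
        n = len(X)
        self.means_ = [sum(col) / n for col in zip(*X)]
        return self  # returning self allows chaining, e.g. fit(X).transform(X)

    def transform(self, X):
        return [[v - m for v, m in zip(row, self.means_)] for row in X]


class NearestMeanClassifier:
    """Toy predictor: fit() stores one centroid per class, predict() picks the nearest."""

    def fit(self, X, y):
        sums, counts = {}, {}
        for row, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(row))
            for i, v in enumerate(row):
                acc[i] += v
            counts[label] = counts.get(label, 0) + 1
        self.centroids_ = {
            label: [s / counts[label] for s in acc] for label, acc in sums.items()
        }
        return self

    def predict(self, X):
        def dist2(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))

        return [
            min(self.centroids_, key=lambda c: dist2(row, self.centroids_[c]))
            for row in X
        ]


class SimplePipeline:
    """Toy composition: chains transformers, then a final predictor."""

    def __init__(self, transformers, estimator):
        self.transformers = transformers
        self.estimator = estimator

    def fit(self, X, y):
        for t in self.transformers:
            X = t.fit(X, y).transform(X)
        self.estimator.fit(X, y)
        return self

    def predict(self, X):
        for t in self.transformers:
            X = t.transform(X)
        return self.estimator.predict(X)


# Hypothetical two-class data, used only to exercise the interface.
X_train = [[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1]]
y_train = ["a", "a", "b", "b"]
model = SimplePipeline([MeanCenterer()], NearestMeanClassifier()).fit(X_train, y_train)
predictions = model.predict([[0.1, 0.0], [1.0, 0.9]])
```

    Because the pipeline itself exposes fit and predict, it can be used wherever a single predictor is expected, which is the kind of composition the abstract refers to.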

    Adversarial Attacks on Classifiers for Eye-based User Modelling

    Full text link
    An ever-growing body of work has demonstrated the rich information content available in eye movements for user modelling, e.g. for predicting users' activities, cognitive processes, or even personality traits. We show that state-of-the-art classifiers for eye-based user modelling are highly vulnerable to adversarial examples: small artificial perturbations in gaze input that can dramatically change a classifier's predictions. We generate these adversarial examples using the Fast Gradient Sign Method (FGSM) that linearises the gradient to find suitable perturbations. On the sample task of eye-based document type recognition we study the success of different adversarial attack scenarios: with and without knowledge about classifier gradients (white-box vs. black-box) as well as with and without targeting the attack to a specific class. In addition, we demonstrate the feasibility of defending against adversarial attacks by adding adversarial examples to a classifier's training data. Comment: 9 pages, 7 figures
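    A minimal sketch of an untargeted, white-box FGSM step, assuming a toy logistic-regression classifier rather than the eye-based models studied in the paper. The weights, input and epsilon below are hypothetical values picked for illustration; the perturbation is epsilon times the sign of the loss gradient with respect to the input.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(w, b, x, y, epsilon):
    """Return x + epsilon * sign(dL/dx), with L the cross-entropy loss for label y."""
    p = predict_proba(w, b, x)
    # For logistic regression with cross-entropy loss, dL/dx_i = (p - y) * w_i.
    grad = [(p - y) * wi for wi in w]
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]


# Hypothetical model parameters and input (assumptions, not from the paper).
w, b = [2.0, -1.0], 0.0
x, y = [0.3, 0.2], 1
epsilon = 0.3

p_clean = predict_proba(w, b, x)      # above 0.5: classified as class 1
x_adv = fgsm(w, b, x, y, epsilon)
p_adv = predict_proba(w, b, x_adv)    # below 0.5: the prediction flips
```

    Each feature moves by at most epsilon, yet the predicted class changes. The defence mentioned in the abstract would add inputs like x_adv, with their true labels, to the training data.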

    Linguistically Informed Information Retrieval for Contact Center Automation

    No full text
    Customer service departments need to handle an increasing volume of textual data in the form of electronic mail. To handle this volume, some kind of automated processing is required. The aim of the research described in this thesis is to employ techniques from the fields of information retrieval (IR) and natural language processing (NLP) to automate part of the customer service pipeline.

    Multi-Emotion Detection in User-Generated Reviews

    No full text
    Expressions of emotion abound in user-generated content, whether it be in blogs, reviews, or on social media. Much work has been devoted to detecting and classifying these emotions, but little of it has acknowledged the fact that emotionally charged text may express multiple emotions at the same time. We describe a new dataset of user-generated movie reviews annotated for emotional expressions, and experimentally validate two algorithms that can detect multiple emotions in each sentence of these reviews.

    PCR-GLOBWB_model: eWaterCycle Development Version

    No full text
    This is the alpha release of the PCR-GLOBWB model, as used in the eWaterCycle project. This version contains a BMI interface for PCRGlobWB, among other improvements. It is strongly advised to use the main version of PCRGlobWB whenever possible, as it contains multiple improvements to the model itself that are not in this version.

    PattyAnalytics

    No full text
    Patty Analytics aims to register pointclouds that were generated from photos or video to an absolute position, scale and orientation. Pointclouds generated from photos are generally messy; they have holes and floating unidentified objects. In our scripts we assume we have the following information: a map (drivemap) which has an extremely low resolution but has good absolute coordinates; a footprint polygon denoting more or less the latitude and longitude and area of the object (x and y coordinates). Finally, we have the high-resolution pointcloud of the object. By the nature of creating this pointcloud, it is densest at the object, since the photos usually center on this object. In some cases, there are also camera positions available, relative to the object. Reusable point cloud analytics software. Includes segmentation, registration, file format conversion. This makes use of the Python bindings of the Point Cloud Library (PCL).

    evidence

    No full text
    This release fixes a small bug in how fragments are displayed in the UI.

    Texcavator

    No full text
    Texcavator allows you to use full-text search on the newspaper archive of the Dutch Royal Library. On top of that, it allows for visualizations like word clouds, time lines and heat maps. It also provides services to enhance your search experience like filtering, stopword removal, normalization and stemming.